ggml: allow casting between f32 and i32 #15783
For C/C++, the behavior of float to int cast is to discard the fractional part, truncating the value towards zero. For negative values, this is not the same as 

@slaren thanks, I've updated the test and comment to reflect this. According to the test, the behavior is currently the same on all backends.
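To make the point about negative values concrete, here is a minimal standalone sketch. It assumes the comparison in question is with rounding down (floor); the helper names `cast_f32_to_i32` and `floor_f32_to_i32` are hypothetical and are not ggml functions.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of ggml): the plain C/C++ conversion.
// It discards the fractional part, i.e. it rounds toward zero.
// Only meaningful for finite inputs that fit into int32_t.
static int32_t cast_f32_to_i32(float x) {
    assert(std::isfinite(x));
    return static_cast<int32_t>(x);
}

// For comparison only: rounding toward negative infinity before converting.
static int32_t floor_f32_to_i32(float x) {
    assert(std::isfinite(x));
    return static_cast<int32_t>(std::floor(x));
}
```

With these helpers, `cast_f32_to_i32(1.5f)` and `floor_f32_to_i32(1.5f)` both give 1, while `cast_f32_to_i32(-1.5f)` gives -1 and `floor_f32_to_i32(-1.5f)` gives -2.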
A test that actually verifies that the cast produces the intended values (i.e., 1.5 -> 1, 1 -> 1.0, etc.) would be nice, I guess.
Yes, that would be nice, and it would also be useful for many other ops. It can also serve as an example of how to use certain ops. However, we need to adapt the code of 

Yeah, figured as much, another thing on the collective consciousness might-TODO-list. :)

I think we can implement this by setting the suitable values in 

Yes that's kinda what I'm doing, I set the range to 
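As a rough illustration of the kind of value check discussed in this thread, the sketch below compares cast results against hand-written expected values in both directions. It is standalone and does not use the test-backend-ops infrastructure; `cast_to_i32`, `cast_to_f32` and `check_cast_values` are hypothetical names standing in for a backend's cast op and its test.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical reference for f32 -> i32: element-wise truncation toward zero.
static std::vector<int32_t> cast_to_i32(const std::vector<float> & src) {
    std::vector<int32_t> dst(src.size());
    for (size_t i = 0; i < src.size(); i++) {
        dst[i] = static_cast<int32_t>(src[i]);
    }
    return dst;
}

// Hypothetical reference for i32 -> f32: element-wise conversion.
static std::vector<float> cast_to_f32(const std::vector<int32_t> & src) {
    std::vector<float> dst(src.size());
    for (size_t i = 0; i < src.size(); i++) {
        dst[i] = static_cast<float>(src[i]);
    }
    return dst;
}

// Returns true when both directions reproduce the expected values exactly.
static bool check_cast_values() {
    const std::vector<float>   in_f   = {1.5f, -1.5f, 0.0f, 2.0f, -0.25f, 100.75f};
    const std::vector<int32_t> want_i = {1, -1, 0, 2, 0, 100};
    if (cast_to_i32(in_f) != want_i) {
        return false;
    }
    // All of these integers are exactly representable as f32.
    const std::vector<int32_t> in_i   = {1, -7, 0, 16777216};
    const std::vector<float>   want_f = {1.0f, -7.0f, 0.0f, 16777216.0f};
    return cast_to_f32(in_i) == want_f;
}
```

The inputs are kept small on purpose: values outside the `int32_t` range have no defined result in C++, and integers above 2^24 are not all exactly representable as f32, so an exact comparison only makes sense within a restricted range.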
The Vulkan change looks good,

    } else if (dst->type == GGML_TYPE_I32) {
        size_t id = 0;
        int32_t * dst_ptr = (int32_t *) dst->data;

        for (int i03 = 0; i03 < ne03; i03++) {
            for (int i02 = 0; i02 < ne02; i02++) {
                id += ne00 * ir0;
                for (int i01 = ir0; i01 < ir1; i01++) {
                    for (int i00 = 0; i00 < ne00; i00++) {
                        const float * src0_ptr = (float *) ((char *) src0->data + i00*nb00 + i01*nb01 + i02*nb02 + i03*nb03);

                        dst_ptr[id] = *src0_ptr;
                        id++;
                    }
                }
                id += ne00 * (ne01 - ir1);
            }
        }
    }

Should we merge this into the F32 branch above?
Yes, and indeed I also want to move some of this code into a template function. WDYT?
Yes, refactoring this code is welcome.
I'll merge this PR as-is and will open another PR to refactor this code
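For reference, one possible shape of the template refactor mentioned above is sketched here. This is only an illustration under stated assumptions, not the code of the follow-up PR: `copy_rows_to_contiguous` is a hypothetical name, and the `ne`/`nb` arrays stand in for the `ne00..ne03` element counts and `nb00..nb03` byte strides used in the snippet under review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch: copy rows [ir0, ir1) of a strided 4-D source into a
// contiguous destination, converting each element with static_cast.
//   ne[k] = number of elements along dimension k
//   nb[k] = stride in bytes along dimension k
template <typename src_t, typename dst_t>
static void copy_rows_to_contiguous(const void * src_data, dst_t * dst,
                                    const int64_t ne[4], const size_t nb[4],
                                    int64_t ir0, int64_t ir1) {
    assert(0 <= ir0 && ir0 <= ir1 && ir1 <= ne[1]);
    size_t id = 0;
    for (int64_t i03 = 0; i03 < ne[3]; i03++) {
        for (int64_t i02 = 0; i02 < ne[2]; i02++) {
            // skip the rows of this plane that come before ir0
            id += (size_t) (ne[0] * ir0);
            for (int64_t i01 = ir0; i01 < ir1; i01++) {
                for (int64_t i00 = 0; i00 < ne[0]; i00++) {
                    const src_t * src_ptr = (const src_t *) ((const char *) src_data
                        + i00*nb[0] + i01*nb[1] + i02*nb[2] + i03*nb[3]);
                    dst[id++] = static_cast<dst_t>(*src_ptr);
                }
            }
            // skip the rows of this plane that come after ir1
            id += (size_t) (ne[0] * (ne[1] - ir1));
        }
    }
}
```

Instantiating it as `copy_rows_to_contiguous<float, int32_t>` would cover the I32 branch shown above, and `<float, float>` would cover a plain F32 copy, which is why a single template could replace both branches. Types that need a dedicated conversion routine rather than `static_cast` would still need their own handling.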
Motivation:
ggml_argmax or ggml_top_k due to missing an op to convert i32 --> f32
Note: casting from f32 --> i32 will discard the fractional part
Planned to implement it on these backends:
test-backend-ops: